FOREWORD

The USAF Extension Course Institute,
with hundreds of courses and thousands of
examinations, is in an excellent position
to apply sophisticated techniques in its
evaluation program. One such technique is
described here: a program to estimate
failure rates and reliability prior to test
administration.

Since the field-testing and refinement
of so many instruments is a luxury beyond our
means, predictive measures of difficulty and
reliability are necessary tools of test development
and evaluation. Mr. Vergil McIntosh,
of the ECI Evaluation and Research Division, has
developed predictive measures that meet our needs
admirably in this area.

This report on the programs he has developed
has been published in the thought that other
educational institutions, both military and civilian,
can benefit from our findings. The comments of
users would be appreciated.

HAROLD MARKOWITZ, JR., Lt Col, USAF
Chief, Evaluation and Research Division
ESTIMATING EXAMINATION FAILURE RATES AND RELIABILITY
PRIOR TO ADMINISTRATION




Section A - Introduction

Problem:

Because of the requirement to place examinations in use
before pre-testing, the Extension Course Institute (ECI)
sometimes finds that examinations are too difficult or their
reliability is not high enough to be acceptable.[1] Therefore,
a procedure is needed to accurately predict these test
statistics before the test is activated.

To meet these needs a system has been devised and evaluated
to estimate test statistics by making an estimate of the ease
and discrimination index for each item. The procedure was
tried, compared with actual statistical analyses, and found,
in nearly all cases, to give close approximations.

The procedure was first computed manually using a worksheet
and a normal curve probability table. A computer program
was later developed which makes the computations and prints
out a report in approximately one minute. Both the manual and
the computer procedures are described in the following sections.

Section B - Procedures

Statistical Formulas:

In order to follow the rationale for the procedure it is
necessary to consider the statistical formulas involved in the
present statistical analysis of examinations. These formulas
are:

Reliability - Kuder-Richardson Formula 21.

$$R = \frac{n\sigma^2 - M(n - M)}{\sigma^2 (n - 1)}$$

Where: n = the number of items on the examination; σ = the
standard deviation of scores; M = mean of examination scores.
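
The reliability formula can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the ECI procedure: the function name `kr21` and the sample values are hypothetical, and the expression follows the numerator and denominator as they are laid out on the manual worksheet.

```python
def kr21(n_items: int, mean: float, variance: float) -> float:
    """Kuder-Richardson formula 21 reliability estimate.

    n_items  -- number of items on the examination (n)
    mean     -- mean of the examination raw scores (M)
    variance -- variance of the raw scores (sigma squared)
    """
    if n_items < 2:
        raise ValueError("KR-21 needs at least two items")
    if variance <= 0:
        raise ValueError("variance must be positive")
    numerator = n_items * variance - mean * (n_items - mean)
    denominator = variance * (n_items - 1)
    return numerator / denominator


# Hypothetical values chosen only to exercise the function.
print(round(kr21(n_items=72, mean=51.2, variance=90.8), 3))
```

Passing the variance rather than the standard deviation avoids squaring a rounded value, which matters when the estimate is carried to three decimal places.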



1. Internal standards define an unacceptable examination as one
having a failure rate in excess of 35% and/or a reliability
coefficient of less than [illegible].



Standard Deviation (σ):

$$\sigma = \sqrt{\frac{\sum x^2}{N}}$$

where: x = any deviation from the mean; Σx² = sum of the
squared deviations; N = number of cases.

Since we do not have all of the data available to substitute
in the above formulas until a sample of student solutions
has been received, it is obvious that we must make some
estimates.

Ebel [2] gives a formula which can be used to estimate
the variance of the scores on a test. It is expressed as:

$$\sigma^2 = \frac{(\sum D)^2}{6}$$

where: ΣD is the sum of the indices of discrimination for a
test.

In using this formula to predict the variance of a sample
of ECI tests it was found that variance can be predicted best
by using a divisor of about 4.5. The reason for this difference
is not known, but Ebel may have used a different formula for
computing discrimination indexes.
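
A minimal Python sketch of this variance estimate follows. The function name `estimate_variance` and the sample indexes are hypothetical; the default divisor of 4.5 is the value assigned in the program listing in Section C, and it is left as a parameter because the report itself treats it as an empirical adjustment.

```python
from typing import Sequence


def estimate_variance(discrimination_indexes: Sequence[float],
                      divisor: float = 4.5) -> float:
    """Estimate raw-score variance from item discrimination indexes.

    Each index is expected as a decimal fraction (for example 0.30).
    The estimate is the squared sum of the indexes over the divisor.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    total = sum(discrimination_indexes)
    return total ** 2 / divisor


# Hypothetical 60-item test with every index estimated at 0.30.
indexes = [0.30] * 60
variance = estimate_variance(indexes)
print(round(variance, 2), round(variance ** 0.5, 2))
```

Because the sum is squared, a small systematic bias in the item estimates has a magnified effect on the predicted variance, which is one reason to compare the divisor against local data before relying on it.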

In order to estimate the failure rate it is necessary
to compute the area under the normal probability curve falling
below the fail score. This can be computed by determining the
difference in standard deviations between the mean and the fail
point by the formula:

$$\text{SD diff} = \frac{x}{\sigma}$$

where: x is the difference in score units between the mean and
the fail score; and σ is the standard deviation of the scores.
By referring to a table of the fractional parts of the area
under the normal probability curve, the percent of scores falling
between the mean and fail point can be determined (e.g. Table A,
p. 4^8 in Garrett, Statistics in Psychology and Education).



2. R. L. Ebel, Essentials of Educational Measurement. Englewood
Cliffs, NJ: Prentice-Hall, 1972, pp. 399-401.



Subtracting this value from 50 percent results in the percent
of estimated failures for the examination. This, of course,
assumes student scores approximate a normal distribution. In
using this procedure with a group of ECI courses, it was found
that the predictions were generally close to the actual failure
rate.
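
The table lookup can be replaced by a direct computation of the normal curve area. The sketch below is an assumption-laden illustration rather than the ECI method: the names `normal_cdf` and `estimated_failure_rate` are hypothetical, and the error function from the Python standard library stands in for the printed table.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def estimated_failure_rate(mean: float, fail_point: float, sd: float) -> float:
    """Proportion of scores expected to fall below the fail point."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    sd_diff = (mean - fail_point) / sd          # x / sigma
    area_between = normal_cdf(sd_diff) - 0.5    # mean to fail point
    return 0.5 - area_between                   # area below fail point


# Hypothetical test: mean 63.8, fail point 50, standard deviation 10,
# so the fail point lies 1.38 standard deviations below the mean.
print(f"{estimated_failure_rate(63.8, 50.0, 10.0):.4f}")
```

Unlike the table, the computation also handles the case in which the mean falls below the fail point: the area between the two becomes negative and the estimated failure rate rises above 50 percent.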

Manual Computations:

The steps in estimating the examination statistics are
as follows:

STEP 1: ESTIMATE THE EASE INDEX AND DISCRIMINATION INDEX
FOR EACH ITEM IN THE ITEM BANK. This step is done by the test
constructor as he checks the item pool. If the items have been
used on previous examinations, the item analysis statistics can
provide a good basis for estimating the expected performance of
each item. Estimates for individual items may not have a high
degree of accuracy; however, when averages for all items are
computed, the estimated and actual performance ought not differ
greatly. This generalization rests on the fact that errors in
individual estimates tend to cancel when averaged, provided the
estimates are not systematically biased. This step can be
refined and the accuracy improved through (a) preparing
guidelines for making estimates, (b) collecting and analyzing
data on estimates, and (c) holding in-service training on
making estimates for test constructors.

STEP 2: SELECT ITEMS FOR THE TWO PARALLEL COURSE EXAMINATION
(CE) FORMS AND COMPUTE THE AVERAGES OF THE ITEM DISCRIMINATION
INDEXES AND THE ITEM EASE INDEXES. A worksheet (see Figure 1)
has been devised to assist in making the computations.

STEP 3: COMPUTE THE VARIANCE (σ²) AND STANDARD DEVIATION
(σ). See Statistical Formulas above.

STEP 4: COMPUTE THE MEAN (M) OF THE RAW SCORES. M equals
the number of items on the examination times the average item
ease.

STEP 5: COMPUTE THE FAIL POINT. Fail point = .60 x the
number of items on the examination.[4]

STEP 6: SUBTRACT THE FAIL POINT FROM THE MEAN AND DIVIDE
THE DIFFERENCE BY THE STANDARD DEVIATION. This gives the
difference in terms of standard deviation units.



4. Internal standards mandate this fail point which is based 
on Air Training Command resident school standards. 
For computing test failure rates and reliability

A. Number of items on the examination
B. Sum of Discrimination Indexes
C. Mean of Discrimination Indexes [B/A]
D. Sum of Item Ease Indexes
E. Mean of Ease Indexes [D/A]
F. Mean of Raw Scores [A x E]
G. Fail Score [A x .60]
H. Estimated Variance (σ²) [B²/4.5]
I. Estimated Standard Deviation [√H]
J. Difference between Mean and Fail Score [F - G]
K. Difference "J" in terms of Standard Deviations [J/I]
L. Percent of Scores between Mean and Fail Point
   (refer to table of normal probability curve)
M. Estimated Failure Rate [50 - L]

Estimate the test reliability using Kuder-Richardson
formula 21:

$$R = \frac{n\sigma^2 - M(n - M)}{\sigma^2 (n - 1)}$$

N. n x σ² = [A x H] =
O. n - M = [A - F] =
P. M(n - M) = [F x O] =
Q. The numerator = [N - P] =
R. n - 1 = [A - 1] =
S. The denominator = [H x R] =
T. Reliability = [Q/S] =

Figure 1. Worksheet for Computing Estimates Manually.



[Figure: normal curve with the mean and the fail point marked.]

If x/σ = 1.38, the area between the mean and the fail point
is 41.62% of the total area (see table below).
Area below fail point = 50.00 - 41.62 = 8.38%, which
is the percent of the students expected to fail the exam.
x is the difference in raw score points between
the mean and the fail point.

Fractional parts of the total area under the normal
probability curve, corresponding to distances on the
baseline between the mean and successive points laid
off from the mean in units of standard deviation.

Example: between the mean and a point 1.51σ (x/σ = 1.51)
are found 43.45% of the entire area under the curve.

Figure 2. Normal Curve Probability Table.
STEP 7: DETERMINE THE AREA UNDER THE NORMAL DISTRIBUTION
CURVE BETWEEN THE MEAN AND THE FAIL POINT using the
table at Figure 2.

STEP 8: SUBTRACT THE VALUE IN STEP 7 FROM 50 PERCENT.
This value is the estimated failure rate. It assumes the
distribution of student scores approximates a normal
distribution.

STEP 9: COMPUTE THE TEST RELIABILITY by substituting
the appropriate values in the reliability formula.
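
The nine manual steps can be followed end to end in a short Python sketch. It is offered as an illustration under stated assumptions, not as the ECI program: the names `ExamEstimates` and `estimate_exam` are hypothetical, the item estimates are supplied as decimal fractions, the variance divisor defaults to 4.5 and the fail point to 60 percent of the items, and the normal curve area is computed with the error function in place of the table in Figure 2.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ExamEstimates:
    n_items: int
    mean: float
    fail_point: float
    variance: float
    sd: float
    sd_diff: float
    failure_rate: float
    reliability: float


def estimate_exam(ease: Sequence[float], disc: Sequence[float],
                  divisor: float = 4.5,
                  pass_fraction: float = 0.60) -> ExamEstimates:
    """Worksheet steps 2-9 for one examination form."""
    if len(ease) != len(disc) or len(ease) < 2:
        raise ValueError("need matching ease and discrimination lists")
    n = len(ease)
    variance = sum(disc) ** 2 / divisor               # step 3
    if variance <= 0:
        raise ValueError("estimated variance must be positive")
    sd = math.sqrt(variance)
    mean = sum(ease)                                  # step 4: n x average ease
    fail_point = pass_fraction * n                    # step 5
    sd_diff = (mean - fail_point) / sd                # step 6
    # Steps 7-8: area below the fail point under a normal curve.
    failure_rate = 0.5 * (1.0 - math.erf(sd_diff / math.sqrt(2.0)))
    # Step 9: Kuder-Richardson formula 21.
    reliability = (n * variance - mean * (n - mean)) / (variance * (n - 1))
    return ExamEstimates(n, mean, fail_point, variance, sd,
                         sd_diff, failure_rate, reliability)


# Hypothetical 72-item form with uniform item estimates.
print(estimate_exam([0.70] * 72, [0.28] * 72))
```

Keeping every intermediate value in the returned record mirrors the worksheet rows, so a hand computation can be checked line by line against the output.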



Section C - Computer Program for Computing Estimates

A computer program has been written in the BASIC language
to expedite the computing process. The steps in the procedure
are as follows:

STEP 1: Estimate the ease and discrimination indexes for
each item in the item bank.

STEP 2: Input the item ease and discrimination indexes 
for the selected items into a disk file via a remote terminal. 
Do not use decimal points in inputting the data. 

STEP 3: Use the ISE 2 computer program (see Figure 3) to
compute the estimates and print out a report. In running the
ISE 2 program, the file name for the data file should be entered
in line 060. Line 070 should be checked (listed) to assure the
read statement corresponds to data listed in the file. The
value "y" will read the ease index, and "z" the discrimination
index.

STEP 4: A report will be printed out on the remote terminal.
A sample report is shown in Figure 4.



10 REM***THIS PROGRAM ESTIMATES MEANS, FAILURE RATES,
20 REM AND RELIABILITY***
30 REM ***DATA IS ENTERED FROM A FILE***
40 PRINT "ENTER COURSE AND FORM NUMBER"
50 INPUT C1,C2
60 FILES 631505B
70 FILES NRMCRV
80 READ #1,X,Y,Z
90 N=N+1
100 E=E+Y
110 D=D+Z
120 IF MORE #1 THEN 80
130 REM***COMPUTE AVG EASE***
140 H=E/(N*100)
150 REM***COMPUTE AVG ITEM DISC***
160 G=(D/100)/(N)
170 REM***COMPUTE MEAN OF RAW SCORES***
180 R=N*H
190 Q=N*.60
200 REM***COMPUTE VARIANCE***
210 A=4.5
220 Y=(D/100)^2
230 V=((D/100)^2)/(A)
240 S=V^.5
250 REM COMPUTE DIFF MEAN AND FAIL PT IN SD
260 O=(R-Q)/S
270 PRINT "DIFF MEAN AND FP IN SD=",O
280 O=(O*10)+.5 \ O=INT(O)
290 READ #2,E,F
300 IF E<>O GOTO 290
310 T=.50-F
320 K=(N*V-R*(N-R))/(V*(N-1))
330 PRINT\PRINT\PRINT
340 PRINT TAB(14);"COURSE EXAMINATION STATISTICAL ESTIMATES"
350 PRINT TAB(16);"COURSE";C1;"FORM";C2;SPC(10);"DATE";SPC(2);DAT$
360 PRINT USING 370, N
370: NR ITEMS= ###
380 PRINT USING 390, H
390: AVG EASE= .###
400 PRINT USING 410, G
410: AVG ITEM DISC= .###
420 PRINT USING 430, R
430: MEAN= ##.##
440 PRINT USING 450, S
450: STANDARD DEVIATION= ##.##
460 PRINT USING 470, Q
470: PASS/FAIL POINT= ##.#
480 PRINT USING 490, T
490: EST FAILURE RATE= .##
500 PRINT USING 510, K
510: RELIABILITY= #.###
520 PRINT
530 END

Figure 3. ISE Program to Compute Examination Statistical Estimates.
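
The data entry convention of Step 2, whole numbers without decimal points that the program later divides by 100, can be illustrated with a short Python sketch. The record layout is an assumption: the listing reads three values per record, and the sketch supposes they are an item number, the ease index, and the discrimination index, separated by commas. The function name `parse_item_records` is hypothetical.

```python
from typing import List, Tuple


def parse_item_records(text: str) -> Tuple[List[float], List[float]]:
    """Convert integer-coded item records to decimal indexes.

    Assumed layout per line: item number, ease, discrimination,
    with ease and discrimination entered as whole percents.
    """
    ease: List[float] = []
    disc: List[float] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [int(field) for field in line.split(",")]
        if len(fields) != 3:
            raise ValueError(f"expected three values, got: {line!r}")
        _item_number, y, z = fields
        ease.append(y / 100)   # y carries the ease index
        disc.append(z / 100)   # z carries the discrimination index
    return ease, disc


# Hypothetical two-item data file.
sample = "1,75,30\n2,60,25\n"
print(parse_item_records(sample))
```

Entering whole numbers keeps keying simple at the terminal; the conversion to fractions is done once, when the records are read.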
COURSE 63150   FORM 25   DATE 10/06/[illegible]

NR ITEMS= [illegible]
AVG EASE= [illegible]
AVG ITEM DISC= [illegible]
MEAN= [illegible]
STANDARD DEVIATION= 9.53
PASS/FAIL POINT= 43.2
EST FAILURE RATE= .20
RELIABILITY= [illegible]

Figure 4. Printout of a Statistical Report.
Section D - Conclusions

Findings:



Comparisons were made between the estimates for several
courses and item analyses based on samples of student test
papers. The results showed generally close agreement.
Differences were approximately of the same magnitude as
differences found between two different analyses. Figure 5
is a table comparing estimates with student samples of 51 and
201. Zeros on the table indicate that data are not available.
("CRSE" and "FM" indicate ECI course and examination form
number.)
[Table: estimated statistics compared with analyses of student
samples, by course (CRSE) and form (FM). Only fragments of the
reliability columns survive in the source.]

Figure 5. Comparison of Estimated Statistics with Analysis of Student Samples.



Although estimates are generally close to the actual analyses,
it is likely that some refinements can be made to the procedure,
and guidelines can be prepared to assist test constructors in
making estimates, and thus improve these estimates.






Significant advantages to be realized from using the
estimating procedure are that it will (1) help assure that
different forms of the CEs are equivalent, (2) reduce the
number of CEs with excessive failure rates or low reliability,
and (3) require test constructors to carefully
evaluate an item's function in a test. This will result in
distinct improvement in test quality.

Summary:

A system has been developed to estimate examination
statistics before the examination has been administered. The
system requires the test constructor to make an estimate of
the ease index and discrimination index for each item. These
indexes are then used to compute test estimates using the
worksheet or the computer program. Based on samples of
actual student data, the system has been found to provide
relatively close estimates of test performance.